Scaling regression inputs by dividing by two standard deviations.

Author

  • Andrew Gelman
Abstract

Interpretation of regression coefficients is sensitive to the scale of the inputs. One method often used to place input variables on a common scale is to divide each numeric variable by its standard deviation. Here we propose dividing each numeric variable by two times its standard deviation, so that the generic comparison is with inputs equal to the mean ±1 standard deviation. The resulting coefficients are then directly comparable for untransformed binary predictors. We have implemented the procedure as a function in R. We illustrate the method with two simple analyses that are typical of applied modeling: a linear regression of data from the National Election Study and a multilevel logistic regression of data on the prevalence of rodents in New York City apartments. We recommend our rescaling as a default option (an improvement upon the usual approach of including variables in whatever way they are coded in the data file) so that the magnitudes of coefficients can be directly compared as a matter of routine statistical practice.
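
The rescaling described in the abstract can be sketched in a few lines. This is a minimal illustration in Python, not the author's R function: the names `rescale_two_sd`, `is_binary`, and `rescale_inputs` are hypothetical, and the choices to center each numeric input at its mean and to use the sample (n - 1) standard deviation are assumptions made for the example.

```python
import statistics


def rescale_two_sd(values):
    """Center a numeric input at its mean and divide by two standard deviations.

    Assumptions (not taken from the abstract): centering is applied, and the
    sample (n - 1) standard deviation is used.
    """
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        raise ValueError("a constant input cannot be rescaled")
    return [(v - mean) / (2 * sd) for v in values]


def is_binary(values):
    """Return True when every value is 0 or 1."""
    return set(values) <= {0, 1}


def rescale_inputs(columns):
    """Rescale numeric columns and leave binary (0/1) columns untransformed."""
    return {
        name: list(vals) if is_binary(vals) else rescale_two_sd(vals)
        for name, vals in columns.items()
    }


# Made-up illustrative data, not from the paper.
inputs = {"age": [20, 30, 40, 50, 60], "female": [0, 1, 1, 0, 1]}
rescaled = rescale_inputs(inputs)
```

After rescaling, a numeric input has standard deviation 0.5, so a one-unit change corresponds to moving from one standard deviation below the mean to one above. A binary input with an even 0/1 split also has standard deviation 0.5, which is why coefficients on the rescaled inputs sit on roughly the same scale as those on untransformed binary predictors.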

Similar resources

Pearson’s Correlation Tests (Simulation)

The Pearson correlation coefficient, ρ (rho), is a popular statistic for describing the strength of the relationship between two variables. It is the slope of the regression line between two variables when both variables have been standardized by subtracting their means and dividing by their standard deviations. The correlation ranges between plus and minus one. The population correlation ρ is ...

A Comparison of Thin Plate and Spherical Splines with Multiple Regression

Thin plate and spherical splines are nonparametric methods suitable for spatial data analysis. Thin plate splines give efficient, practical, and high-precision solutions in spatial interpolation. Two components of the model fitting are considered: spatial deviations of the data and the model roughness. On the other hand, in parametric regression, the relationship between explanatory and response v...

FUZZY LINEAR REGRESSION BASED ON LEAST ABSOLUTES DEVIATIONS

This study is an investigation of fuzzy linear regression model for crisp/fuzzy input and fuzzy output data. A least absolutes deviations approach to construct such a model is developed by introducing and applying a new metric on the space of fuzzy numbers. The proposed approach, which can deal with both symmetric and non-symmetric fuzzy observations, is compared with several existing models by...

Scaling, normalizing, and per ratio standards: an allometric modeling approach.

The practice of scaling or normalizing physiological variables (Y) by dividing the variable by an appropriate body size variable (X) to produce what is known as a "per ratio standard" (Y/X) has come under strong criticism from various authors. These authors propose an alternative regression standard based on the linear regression of (Y) on (X) as the predictor variable. However, if linear reg...

Car paint thickness control using artificial neural network and regression method

Struggling in the world's competitive markets, industries are attempting to upgrade their technologies with the aim of improving quality, minimizing waste, and cutting prices. Industry tries to develop its technology in order to improve quality via proactive quality control. This paper studies the possible paint quality in order to reduce the defects through neural network techniques in au...

Journal title:
  • Statistics in Medicine

Volume 27, Issue 15

Pages: -

Publication date: 2008